This paper presents a mathematical analysis of an adaptive quantizer, 
a pulse code modulator, which is used for coding speech and other continu- 
ous signals with a large dynamic range into digital form. The device is a 
two-bit quantizer in which the step size is modified at every sampling instant 
with the object of adapting the range of the device to the intensity level of 
the signal. In the adaptation algorithm analyzed in the paper, the encoded 
information of the previous sampling instant is used either to increase 
or to decrease the step size by fixed, but not necessarily equal, proportions. 

Initially, the stochastic stability of the device is established by construct- 
ing a stochastic Liapunov function. Various basic identities and bounds 
on aspects of the behavior of the device are obtained. The qualitative 
results obtained indicate the nature of the trade-offs between the quality 
of the steady state and the transient performance of the device. Also, 
formulas are developed for the purpose of evaluating the mean time 
required for the step size to adapt from arbitrary initial conditions to 
certain optimal values. 

I. INTRODUCTION 

A mathematical analysis of an adaptive quantizer is presented in this 
paper. The coding thresholds of the device, also referred to as the step 
sizes, are not fixed but adapt according to a particular algorithm. The 
object of the algorithm is to modify the threshold to larger or smaller 
levels, depending on whether the signal intensity level is high or low, in 
a manner that allows a decoder at the receiving end to reconstruct the 
continuous signal effectively. The basic two-bit quantizer, i.e., a quantizer 
with four output levels coded 01, 00, 10, and 11, is characterized at each 
sampling instant by a function of the form shown in Fig. 1. 
Input refers to the nth sample of the continuous signal, x(n), 
n = 0, 1, 2, ...; output refers to the coded signal to be transmitted at 
that instant; and $\Delta$ is the step size. In adaptive quantizers of the type 
to be investigated here, the step size is variable and the step size at the 
nth sampling instant is denoted by $\Delta(n)$. The step size uniquely defines 
the entire function in the manner indicated by Fig. 1; hence, the complete 
adaptive quantizer is associated with a sequence of functions. 
The adaptive quantizers that are the subject of this paper are basically 
characterized by the following adaptation algorithm 

\[ \Delta(n+1) = M_1 \Delta(n) \quad \text{if } |x(n)| \le \Delta(n) \qquad (1a) \]

\[ \Delta(n+1) = M_2 \Delta(n) \quad \text{if } |x(n)| > \Delta(n), \qquad (1b) \]

where $M_1$ and $M_2$, called multiplier coefficients, are fixed constants 
satisfying* $0 < M_1 < 1 < M_2$. Variations on (1) are considered in 
the main text, although the discussion in the introductory section is in 
terms of (1). Results on adaptive quantizers with output levels more 
numerous than 4 will be considered in a future publication. 
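
The update rule in (1) can be sketched in a few lines of Python. This is only an illustration: the function names `adapt_step` and `run` are hypothetical and do not come from the paper, and the numerical values in the usage note are arbitrary.

```python
def adapt_step(delta, x, m1, m2):
    """One update of eq. (1): shrink by m1 if |x| <= delta, otherwise grow by m2."""
    return m1 * delta if abs(x) <= delta else m2 * delta


def run(samples, delta0, m1, m2):
    """Return the whole step-size trajectory, starting from delta0."""
    deltas = [delta0]
    for x in samples:
        deltas.append(adapt_step(deltas[-1], x, m1, m2))
    return deltas
```

For example, `run([0.5, 3.0, 0.1], 1.0, 0.5, 2.0)` shrinks, grows, and shrinks the step size in turn, since the samples fall in the lower, upper, and lower slots respectively.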

The adaptation algorithm in (1) is due to Cummiskey, Flanagan, 
and Jayant. 1,2 In Ref. 1 Jayant presents the results of extensive computer 
simulations undertaken to determine the multiplier coefficients 
that maximize various performance functionals. One class of random 
inputs {x(n)} considered there is obtained by passing a discrete, 
white, Gaussian process through a filter with a single pole. In Ref. 2, 
Cummiskey, Jayant, and Flanagan consider a differential PCM coder 
in which the adaptive quantizer is used together with a fixed first-order 
predictor in the feedback loop. Their work has its direct antecedents 
in the various schemes 3-5 for adapting step sizes in delta-modulators, 
which are one-bit quantizers, and in the work of Wilkinson. 6 Wilkinson's 
paper on a two-bit adaptive quantizer, largely concerned with 
hardware implementation, is particularly interesting. In his scheme, the 
step size is controlled by a moving fraction obtained by keeping a tally 
of the number of times the input falls in the lower slot of the quantizer. 
Goodman and Gersho 7 have independently looked at the adaptive 
quantizer from a theoretical standpoint and their work complements 
the work described here. 

* Since the absolute value of the input in Fig. 1 is partitioned into $[0, \Delta]$ and 
$(\Delta, \infty)$, we shall loosely refer to the event leading to (1a) as "the input falling in the 
lower slot." 

Fig. 1. The quantizer function. 

In this paper we make a number of simplifying assumptions about 
the input sequence {x(n)}, the most restrictive being the assumption 
that it is a sequence of independent random variables. However, we 
have obtained for the idealized model precise results which indicate 
rather fully the trade-offs involved in the choice of the multiplier 
coefficients. Also, we have developed formulas for efficiently computing 
functionals as aids in the design problem. We believe that the broad 
qualitative features of the device that are found to hold in this model 
carry over for more realistic input processes. It is hoped too that the 
techniques developed here will provide a point of reference for future 
work. 

The mathematical analysis, for the main part, is of a random walk 
on the integers, whose complexity is due to the dependence of the state 
transition probabilities on the states. The structure of the random walk 
which is exploited here is rather general, and for this reason the model 
is of independent interest ; to our knowledge, the main mathematical 
results have not appeared in the literature on random walks. 

The organization of the paper is as follows. In Section 1.1 we con- 
tinue the discussion on the adaptation algorithm in the context of a 
particular idealized model of the sequence {x(n) }, and we discuss some 
of the results to be derived later and what is already known about 
optimal quantization in the nonadaptive framework. In Sections 1.2 
and 1.3 we give the basic equations of the process arising from (1), 
and certain modifications of it, when the input sequence {x(n)} is 
independent and identically distributed. In Section II the stochastic 
stability of the device is established under general conditions. The 
existence and uniqueness of the stationary distribution of the step 
size is proved by constructing a stochastic Liapunov function for the 
random process. Section III examines in detail the stationary step 
size distribution. In Section 3.2 we prove an identity which explicitly 
gives the stationary probability of the input falling in the lower slot 
of the quantizer, i.e., $\Pr_s[|x(n)| \le \Delta(n)]$. In Section 3.3 sharp 
bounds are obtained on the stationary probabilities. It is shown that 
for almost all values of the multiplier coefficients there exists a natural 
center of the distribution and that the stationary probabilities fall off 
at least geometrically with increasing distances from the natural 
center. In Section 3.5 results are obtained on a particular limiting 
behavior, namely, the effect on the stationary distribution of making 
both multiplier coefficients close to unity. Section IV is devoted to the 
transient response of the device. In Section 4.1 we develop formulas 
for the efficient computation of the time required for the step size to 
adapt from an arbitrary initial value to the desired step size. Section 
4.2, by giving an explicit bound on this time, provides some insight into 
the dependence of the adaptation time on the choice of the multiplier 
coefficients. Finally, we report some computational results. 

1.1 Background 

In an idealized model for the samples, x(n), of the continuous signal 
process, assume that {x(n)} is a sequence of independent random variables 
with zero mean. Assume further that the distribution of x(n) 
for every n is an element of the same equivalence class of distributions 
in which the distributions are equivalent to within a scaling operation. 
The scaling or intensity level changes slowly with n. For instance, the 
equivalence class of distributions may be the family of Gaussian 
distributions and only the variance, indicating the intensity level, 
changes with n. 

It is necessary to recall at this stage some known facts concerning 
the design of quantizers in the nonadaptive framework, 8 where {x(n)} 
is a sequence of independent, identically distributed random variables 
and the step size is fixed. Suppose that $E[\{y(n) - x(n)\}^2]$ measures 
the performance of the quantizer, where y(n) is the nth output of the 
device.* The step size which minimizes this functional, $\Delta$, is in principle 
easy to establish, and $\Delta$ is uniquely characterized by the probability 
of the input falling in the lower slot, i.e., $\Pr[|x(n)| \le \Delta]$. Another 
observation that is equally easy to verify is that the optimal step size 
has the property that if the distribution of {x(n)} is scaled, then the 
optimal step size is obtained by an identical scaling of the previous 
optimal step size. A convenient way of stating this observation is: a 
property of the optimal step size that is invariant to scaling of the 
distribution of {x(n)} is the probability that the absolute value of the 
input x(n) does not exceed the optimal step size. For instance, when the 
distribution is Gaussian it is known that this probability is close to 
0.68. 8 

* It is not essential that the performance functional be of that form. 
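
The figure of 0.68 quoted for the Gaussian case can be checked numerically. The sketch below rests on an assumption that the text does not state explicitly, because Fig. 1 is not reproduced here: the four output levels are taken to be $\pm\Delta/2$ and $\pm 3\Delta/2$, with thresholds at 0 and $\pm\Delta$. All function names are hypothetical. The mean-square error for a unit-variance Gaussian input is evaluated in closed form and minimized over $\Delta$ by golden-section search.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def seg(a, b, c):
    """Integral of (x - c)^2 * phi(x) over the finite interval [a, b]."""
    m0 = Phi(b) - Phi(a)                 # integral of phi
    m1 = phi(a) - phi(b)                 # integral of x * phi
    m2 = m0 + a * phi(a) - b * phi(b)    # integral of x^2 * phi
    return m2 - 2.0 * c * m1 + c * c * m0


def mse(delta, upper=12.0):
    """E[(y - x)^2] for assumed output levels delta/2 and 3*delta/2 (by symmetry, x >= 0 doubled)."""
    return 2.0 * (seg(0.0, delta, 0.5 * delta) + seg(delta, upper, 1.5 * delta))


def golden_min(f, lo, hi, tol=1e-9):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:
            lo, c = c, d
            d = lo + g * (hi - lo)
    return 0.5 * (lo + hi)


delta_opt = golden_min(mse, 0.2, 3.0)
p_lower = 2.0 * Phi(delta_opt) - 1.0     # Pr[|x| <= delta_opt]
```

Under these assumptions the minimizing step size comes out very close to one standard deviation, and `p_lower` is close to the 0.68 quoted above.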

An intermediate step in proceeding from the nonadaptive case to the 
more general model described prior to it, in which the identically 
distributed condition does not hold, is provided by the following model. 
Assume that the sequence {x(n) } is indeed independent and identically 
distributed, and that the equivalence class of distributions to which 
the particular distribution belongs is known. However, the scaling 
parameter is unknown. It is relatively straightforward to state the 
requirements on a well-behaved algorithm operating in this simple 
framework, and, if these requirements are always satisfied, then it is 
possible to conclude that the device will operate satisfactorily for the 
more general model. The requirements are: (i) for arbitrary initial 
step size guesses, the step size rapidly converges to the optimal step 
size, and (ii) it is thereafter localized in a small neighborhood of that 
point. This paper separately analyzes the two requirements in the 
simple framework just described. Considerations related to (i) and 
(ii) are lumped respectively under the terms "transient response" and 
"steady-state response," since the latter property is effectively investi- 
gated in terms of the stationary distribution of the step size, assuming 
one exists. A good reason for the division is that they lead, in some 
ways, to quite opposite requirements for the multiplier coefficients. 

Consider, in the light of what is known about optimal quantization 
in the nonadaptive framework, what is required for the localization 
property, requirement (ii), to hold. When the stationary distribution 
has both of the following properties, it is possible to establish an effective 
correspondence and infer that (ii) holds: (a) the stationary probability 
of the input falling in the lower slot, i.e., $\Pr_s[|x(n)| \le \Delta(n)]$, 
equals the known value associated with the particular family of dis- 
tributions; and (b) the mass of the stationary distribution is concen- 
trated in the small neighborhood of a point. In Section III we show that 
by appropriate choice of the multiplier coefficients it is possible to 
achieve both requirements. 

1.2 Basic assumptions and equations 

We consider only quantizers with multiplier coefficients having the 
following structure: 

\[ M_1 = \gamma^{-k} \quad \text{and} \quad M_2 = \gamma^{l}, \qquad (2) \]

where $\gamma$ is some real number greater than 1 and k and l are positive 
integers. We shall further make k and l relatively prime, i.e., their 
greatest common factor is 1. If, as we shall assume, the initial step size 
is of the form $\gamma^i$, with i an integer, then the step size is always of that 
form and the space of possible step sizes forms a lattice.* 

There is a step size with, as we shall see, certain claims to being the 
central step size for a particular distribution of {x(n)} and choice of 
parameters k and l; this step size is used as a reference point. There 
exists an integer i such that† 

\[ \Pr[|x(n)| \le \gamma^{i-1}] < \frac{l}{k+l} \le \Pr[|x(n)| \le \gamma^{i}]. \qquad (3) \]

We denote $\gamma^i$ by C and refer to it as the central step size; all step sizes 
are considered to be of the form $C\gamma^i$, i = 0, ±1, ±2, .... 

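Obviously, it is more convenient to work with the log transform of 
the step size, so let 

\[ \omega(n) = \log_\gamma \Delta(n) - \log_\gamma C. \qquad (4) \]

From the original algorithm we have 

\[ \omega(n+1) = \omega(n) - k \quad \text{if } |x(n)| \le C\gamma^{\omega(n)} \]
\[ \omega(n+1) = \omega(n) + l \quad \text{if } |x(n)| > C\gamma^{\omega(n)}. \qquad (5) \]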

We have in (5) a Markov chain with states 0, ±1, ±2, .... The state 
transition probabilities are obtained from the distribution of x(n): 
for all integers i let 

\[ b_i \triangleq \Pr[|x(n)| \le C\gamma^{i}] \qquad (6) \]

and 

\[ a_i = 1 - b_i. \]

The "b" is a mnemonic for backward probabilities since it is associated 
with a transition backwards from the generic state i to (i - k). The 
diagram in Fig. 2 represents the Markov chain. Denoting by $p_i(n)$ 
the probability that $\omega(n) = i$, we have 

\[ p_i(n+1) = b_{i+k}\, p_{i+k}(n) + a_{i-l}\, p_{i-l}(n). \qquad (7) \]



* D. J. Goodman suggested the above structure on the multiplier coefficients with 
the object of obtaining a discrete Markov process. 

† We are tacitly assuming that $\Pr[|x(n)| = 0] \le l/(k+l) - \epsilon$, $\epsilon > 0$. 

Fig. 2. The Markov chain. 

Although the transition probabilities depend on the distribution of 
x(n), the two following properties of the sequence $\{b_i\}$, on which we 
base our results, hold irrespective of the distribution: 

\[ 0 \le b_i < b_{i+1} \le 1 \quad \text{for all } i, \qquad (8) \]

and 

\[ b_{-1} < \frac{l}{k+l} \le b_0. \qquad (9) \]



That the strict inequality in (8) holds for all i is a mild restriction on 
the distribution of x(n) ; however, certain straightforward modifications 
may be made to obtain corresponding results when the strict inequality 
does not hold for all i. 

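The property of the 0 state to which we alluded earlier may be loosely 
stated thus: there is a net drift to the left (right) from states to the 
right (left) of the 0 state. Formally, 

\[ E[\omega(n+1) \mid \omega(n) = i] - i = -(k+l)\left(b_i - \frac{l}{k+l}\right) \; \begin{cases} < 0 & \text{if } i > 0 \\ > 0 & \text{if } i < 0. \end{cases} \qquad (10) \]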



The above super- and submartingale properties are the basis for the 
existence of a stochastic Liapunov function (Section 2.2) and the 
bound obtained in Section 4.2. 
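
The quantities in (3), (6), and (10) are easy to tabulate for a specific input law. The sketch below assumes a zero-mean Gaussian input with standard deviation `sigma`; that choice, and the function names `b`, `central_step`, and `drift`, are illustrative assumptions and not part of the paper.

```python
import math


def b(i, C, gamma, sigma=1.0):
    """b_i = Pr[|x| <= C * gamma**i] of eq. (6), for an assumed Gaussian input."""
    return math.erf(C * gamma ** i / (sigma * math.sqrt(2.0)))


def central_step(gamma, k, l, sigma=1.0):
    """Smallest lattice point gamma**i with Pr[|x| <= gamma**i] >= l/(k+l), as in (3)."""
    target = l / (k + l)
    i = 0
    # Walk down until the probability drops below the target ...
    while math.erf(gamma ** i / (sigma * math.sqrt(2.0))) >= target:
        i -= 1
    # ... then walk up to the first lattice point that meets it.
    while math.erf(gamma ** i / (sigma * math.sqrt(2.0))) < target:
        i += 1
    return gamma ** i


def drift(i, C, gamma, k, l, sigma=1.0):
    """E[w(n+1) | w(n) = i] - i, which equals -(k+l) * (b_i - l/(k+l)) as in (10)."""
    bi = b(i, C, gamma, sigma)
    return -k * bi + l * (1.0 - bi)
```

With, say, `gamma = 1.1`, `k = 1`, `l = 2`, calling `drift(i, C, gamma, k, l)` with `C = central_step(gamma, k, l)` gives negative values for i > 0 and positive values for i < 0, which is the restoring property stated in (10).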

Remarks: The random walk in (5) with k = l = 1 is also the model 
for the delta-modulator subject to random, independent, identically 
distributed inputs. The stationary behavior of the model was treated 
in an elegant paper by Fine. 9 Gersho 10 has established the stochastic 
stability of the delta-modulator for a larger class of input processes. 
Some of our results, particularly those in Section IV on transient re- 
sponse, appear to be new and of some interest in this context. 

1.3 The saturating adaptive quantizer 

For the algorithm in (1) and, say, Gaussian distributions of the 
input, there is a small, positive probability of the step size exceeding 
any large prespecified level. A model which reflects more accurately 
the practical algorithm for adapting the step size is one which does not 
allow the step size to become unbounded. One way of implementing 
this is to make the step size saturate at some suitably large level, 
i.e., if $\Delta(n) < |x(n)|$, then 

\[ \Delta(n+1) = \min[M_2 \Delta(n), L]; \quad L \gg 0; \qquad (11) \]

i.e., in the log transformed variables, 

\[ \omega(n+1) = \min[\omega(n) + l, L]; \quad L \gg 0. \qquad (12) \]

The model of this device, which we shall refer to as the saturating 
adaptive quantizer, is useful not only for the reasons given but also 
on theoretical grounds since the results obtained for the saturating 
adaptive quantizer yield, in the limit as $L \to \infty$, corresponding results 
for the adaptive quantizer. We carry both models with us throughout 
the paper and at least indicate along the way the main correspondences. 

For similar reasons we expect that in practice the step size will also 
be bounded from below in the obvious manner. This case is not for- 
mally dealt with in the text since the main results may be readily 
inferred from the saturating adaptive quantizer. 

For the saturating adaptive quantizer, the following equations 
govern the evolution of $\{p_i(n) = \Pr[\omega(n) = i]\}$, $i \le L$: 

\[ p_i(n+1) = b_{i+k}\, p_{i+k}(n) + a_{i-l}\, p_{i-l}(n), \quad i \le L - k \]
\[ p_i(n+1) = a_{i-l}\, p_{i-l}(n), \quad L - k + 1 \le i \le L - 1 \]
\[ p_L(n+1) = \sum_{j=L-l}^{L} a_j\, p_j(n). \qquad (13) \]



The important super- and sub-martingale properties of the random 
walk, as expressed by the inequalities in eq. (10), apply as well to the 
saturating adaptive quantizer. 
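
One step of the evolution in (13) can be written as a push-forward of probability mass. The sketch below is a minimal illustration with hypothetical names `step` and `evolve`. Because a program needs a finite state space, it also clamps the walk at the lowest stored state; that lower clamp is an assumption made here for convenience (the text only mentions that a lower bound is expected in practice) and is not part of (13).

```python
def step(p, b, k, l):
    """One application of (13). Index s stands for a state; the last index is the saturation level L."""
    n = len(p)
    q = [0.0] * n
    for s, mass in enumerate(p):
        down = max(s - k, 0)       # assumed clamp at the lowest stored state
        up = min(s + l, n - 1)     # saturation at L, as in (12)
        q[down] += b[s] * mass             # input fell in the lower slot
        q[up] += (1.0 - b[s]) * mass       # input fell in the upper slot
    return q


def evolve(p, b, k, l, steps):
    """Apply step() repeatedly, starting from the distribution p."""
    for _ in range(steps):
        p = step(p, b, k, l)
    return p
```

Here `b` is the list of backward probabilities for the stored states, ordered from the lowest state to L. Each call to `step` conserves total probability, so `evolve` always returns a distribution.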

II. THE EXISTENCE AND UNIQUENESS OF THE STATIONARY DISTRIBUTION 

We examine in this section questions related to the stochastic 
stability of the adaptive quantizer. We establish theoretically that 
certain acute types of erratic operations such as the unboundedness of 
the evolving random variable, namely, the step size, do not occur. We 
begin by establishing that the process has the basic properties of a 
well-behaved process, namely, irreducibility and recurrence. We 
thereby establish the existence and uniqueness of a finite stationary 
distribution. We then proceed to the saturating adaptive quantizer, 
the more realistic model of the adapting algorithm, which in addition 
to the above properties, is also aperiodic. Here, the entire state space 
is a single ergodic class. The main result of this section is obtained from 
the construction of a stochastic Liapunov function for the process; and 
the theory of stochastic Liapunov functions is fairly well known. 11,12 

2.1 Irreducibility of the Markov chain 

The chain is irreducible if and only if every state communicates 
with both the neighboring states. This occurs if and only if there 
exist nonnegative integers m, m', n, n' such that 

ml - nk = 1 (14a) 

and 

m'l - n'k = - 1. (14b) 

It is an elementary fact from number theory that this occurs if and 
only if k and I are relatively prime, i.e., their greatest common divisor 
is unity. In fact, Euclid's algorithm yields the unknown quantities 
in eq. (14). 
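
As a concrete illustration of the last remark, the extended Euclidean algorithm produces the integers in (14). The sketch below is one way to do it; the names `ext_gcd` and `path_counts` are hypothetical. It first finds any integer solution and then shifts it along the solution family until both counts are nonnegative.

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y


def path_counts(k, l, target):
    """Nonnegative (m, n) with m*l - n*k == target, for target = +1 or -1."""
    g, x, y = ext_gcd(l, k)            # l*x + k*y == g
    if g != 1:
        raise ValueError("k and l must be relatively prime")
    m, n = target * x, -target * y     # m*l - n*k == target
    # (m + t*k, n + t*l) solves the same equation for every integer t.
    while m < 0 or n < 0:
        m, n = m + k, n + l
    return m, n
```

For k = 1 and l = 2, `path_counts(1, 2, 1)` gives one upward and one downward move, and `path_counts(1, 2, -1)` gives a single downward move, so each state reaches both of its neighbors.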

2.2 Recurrence 

Consider the following nonnegative function of the states: 

\[ V(i) = |i|, \quad i = 0, \pm 1, \ldots. \qquad (15) \]

This function is a stochastic Liapunov function 12 if the following 
holds: if D(i) is defined as follows, 

\[ E[V[\omega(n+1)] \mid \omega(n) = i] - V(i) = D(i), \qquad (16) \]

then (i) D(i) is uniformly bounded from above and (ii) $D(i) \le -\epsilon < 0$ 
for all but a finite set of states i. Condition (i) is trivially true for the 
process. Also, for all $i \ge k$, 

\[ D(i) = -(k+l)\left(b_i - \frac{l}{k+l}\right) \le -(k+l)\left(b_k - \frac{l}{k+l}\right) < 0 \qquad (17) \]

and, for all $i \le -l$, 

\[ D(i) = (k+l)\left(b_i - \frac{l}{k+l}\right) \le (k+l)\left(b_{-l} - \frac{l}{k+l}\right) < 0. \qquad (18) \]

Therefore, condition (ii) is verified, and V(i) is a stochastic Liapunov 
function for the process. 
From Theorem 7 of Kushner 12 we have recurrence* and we can infer 
further, from Theorem 4, that there exists at least one finite invariant 
measure, i.e., stationary distribution. Also, as we have shown earlier, 
there do not exist two or more disjoint self-contained subsets of the 
state space; hence, we have from Theorem 5 that there is at most one 
invariant probability measure. Thus, the existence and uniqueness of a 
finite stationary distribution for the step size of the adaptive quantizer 
are established. 

2.3 The saturating adaptive quantizer 

We will circumvent the technical nuisance† posed by periodicity by 
proceeding to the saturating adaptive quantizer. In this case the above 
arguments leading to irreducibility and recurrence are intact. In 
addition, the end state L has period 1 and, since periodicity is a 
class concept (i.e., every state in a particular communicating class 
has the same periodicity), the entire Markov chain is aperiodic. We 
have, then, $p(n) \to p$ for any p(0), and $p_i > 0$ for all i. Also, the state 
space is a single ergodic class. Hence, the statistical average of the 
step sizes approaches a limit given by the unique, finite, stationary 
distribution. 

III. SOME PROPERTIES OF THE STATIONARY DISTRIBUTIONS 

In this section we investigate in detail properties of the stationary 
distribution of the step size. In eq. (7), if we set $p_i(n+1) = p_i(n) = p_i$, 
then the stationary distribution is given by $\{p_i\}$. Thus, the stationary 
probabilities are the solutions of 

\[ p_i = b_{i+k}\, p_{i+k} + a_{i-l}\, p_{i-l} \qquad (19) \]

with, of course, the normalization, 

\[ \sum_{i=-\infty}^{\infty} p_i = 1. \qquad (20) \]

* A Markov chain is recurrent if and only if every state is recurrent; and state i 
is recurrent if and only if, starting from state i, the probability of returning to state 
i after some finite length of time is one. 

† Feller 13 writes: "The classification into persistent and transient states is fundamental, 
whereas the classification into periodic and aperiodic states concerns a 
technical detail." 

For the saturating adaptive quantizer, we have from eq. (13) that 
the basic recursion in (19) holds for all $i \le (L - k)$. The remaining 
equations are (20) and the following: 

\[ p_i = a_{i-l}\, p_{i-l}, \quad L - k + 1 \le i \le L - 1 \qquad (21a) \]

\[ p_L = \sum_{j=L-l}^{L} a_j\, p_j \qquad (21b) \]

and, of course, $p_i = 0$, $i > L$. 

3.1 A useful reduction of the equations for the stationary probabilities 

To provide some insight into the motivation for the step we under- 
take here, consider the recursion, analogous to (19), that would arise 
from a Markov chain with uniform transition probabilities : 

\[ p_i = b\, p_{i+k} + a\, p_{i-l}, \quad a + b = 1. \qquad (22) \]

A particular solution of the above recursion is $p_i = c$, a constant. Since, 
in probability theory, interest is restricted to solutions with bounded 
sums, one would proceed in the case of (22) by factoring the root at 
unity from the characteristic polynomial: 

\[ b\lambda^{k+l} - \lambda^{l} + a = 0, \]

and thus obtain a new, and reduced, polynomial and an associated 
recursion. This operation is paralleled for the more general recursion 
in (19) by the following: from (19), 

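\[ p_i - p_{i-l} = b_{i+k}\, p_{i+k} - b_{i-l}\, p_{i-l}. \]

Hence, for all j, 

\[ \sum_{i=-\infty}^{j} (p_i - p_{i-l}) = \sum_{i=-\infty}^{j} (b_{i+k}\, p_{i+k} - b_{i-l}\, p_{i-l}), \qquad (22a) \]

which reduces to 

\[ \sum_{i=j-l+1}^{j} a_i\, p_i = \sum_{i=j+1}^{j+k} b_i\, p_i. \qquad (23) \]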



Remarks: 

(i) Observe that we are justified in carrying out the operation in 
(22a) in the case of solutions of (19) for which $\sum_{-\infty}^{\infty} p_i$ is bounded, 
which we have established, in Section II, to be the case for the stationary 
probabilities. 

(ii) The reduction alluded to earlier refers to the fact that the largest 
difference in variable indices in (23) is k + l, while the largest difference 
in (19) is k + l + 1. 

(iii) Observe that when k = l = 1, (23) gives the solution in closed 
form: $p_{j+1} = (a_j / b_{j+1})\, p_j$ and $\sum p_j = 1$. This is a previously known 
fact; see Feller 14 and Fine. 9 However, neither author gave any indication 
of the possible generalization to the form in (23). 

For the saturating adaptive quantizer, (23) holds for all $j \le (L - k)$. 
Hence, the range over which (23) is valid is such that every state is 
included in at least one component of the recursion. 
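
The closed form for k = l = 1 in remark (iii) lends itself to a quick numerical check. The sketch below is illustrative only: it truncates the infinite lattice to a finite window of states, assumes a unit-variance Gaussian input with step sizes $\gamma^i$, and uses the hypothetical name `closed_form`. It builds the solution from the ratio $p_{j+1}/p_j = a_j/b_{j+1}$ in both directions from the central state and then normalizes.

```python
import math


def closed_form(b, center):
    """Solve p[j+1] = (a[j] / b[j+1]) * p[j] on a finite window, then normalize."""
    n = len(b)
    p = [0.0] * n
    p[center] = 1.0
    for j in range(center, n - 1):             # upward from the central state
        p[j + 1] = (1.0 - b[j]) / b[j + 1] * p[j]
    for j in range(center - 1, -1, -1):        # downward from the central state
        p[j] = b[j + 1] / (1.0 - b[j]) * p[j + 1]
    total = sum(p)
    return [v / total for v in p]


gamma = 1.3
lo, hi = -15, 10                               # assumed window of states
b = [math.erf(gamma ** i / math.sqrt(2.0)) for i in range(lo, hi + 1)]
p = closed_form(b, center=-lo)
```

At every interior state the result satisfies the stationary recursion (19) with k = l = 1, that is, `p[i] == b[i+1]*p[i+1] + (1 - b[i-1])*p[i-1]` up to rounding.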

3.2 An Identity involving the stationary distribution 

We use eq. (23) to show that the stationary probability of the nth 
input sample, x(n), falling in the lower slot is $\Pr_s[|x(n)| \le \Delta(n)] 
= l/(k+l)$. The significance of this identity from the point of view 
of optimal steady-state operation (see Section 1.1) is that by appropriate 
choice of k and l the above quantity may be matched to the 
corresponding probability for the optimal nonadaptive step size. This, 
of course, has the effect of locating the central step size, eq. (3), close 
to the optimal nonadaptive step size. In the case of independent 
Gaussian inputs, the above quantity is close to 0.68 and a reasonable 
approximation is obtained by making k = 1 and l = 2. 

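From (23), 

\[ \sum_{i=j-l+1}^{j+k} b_i\, p_i = \sum_{i=j-l+1}^{j} p_i. \]

Hence, 

\[ \sum_{j=-\infty}^{\infty} \sum_{i=j-l+1}^{j+k} b_i\, p_i = \sum_{j=-\infty}^{\infty} \sum_{i=j-l+1}^{j} p_i. \qquad (24) \]

The left-hand side equals $(k + l) \sum_{-\infty}^{\infty} b_i p_i$, while the right-hand side 
equals l. Hence, 

\[ \sum_{i=-\infty}^{\infty} b_i\, p_i = \frac{l}{k+l}. \qquad (25) \]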



Consider what the above equality implies in terms of step size 
behavior. The stationary probability of the input falling in the lower 
slot is 

\[ \Pr_s[|x(n)| \le \Delta(n)] = \sum_{i=-\infty}^{\infty} \Pr_s[\Delta = C\gamma^i \text{ and } |x| \le C\gamma^i] = \sum_{i=-\infty}^{\infty} b_i \Pr_s[\Delta = C\gamma^i] \qquad (26) \]

from the independence of {x(n)}. Hence, from (25), 

\[ \Pr_s[|x(n)| \le \Delta(n)] = \frac{l}{k+l}. \qquad (27) \]

Immediately on substituting $M_1 = \gamma^{-k}$ and $M_2 = \gamma^{l}$ we have an 
identity with a rather appealing and natural interpretation*: 

\[ M_1^{p_1} M_2^{p_2} = 1, \qquad (28) \]

where $p_1$ and $p_2$ are respectively the two stationary probabilities of 
the input falling in the lower and upper slots. 

For the saturating adaptive quantizer, it can be shown that 

\[ \sum_{i \le L} b_i\, p_i < \frac{l}{k+l}. \qquad (29) \]

However, the quantity $[l/(k+l) - \sum b_i p_i]$ depends only on (k + l) 
terms involving the end probabilities $p_L, \ldots, p_{L-k-l}$ and it goes to 
zero with these probabilities. Now we will prove in Section 3.3 certain 
results which indicate that these probabilities are relatively small if 
L is large. 
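
The identities (25) and (28) can be checked numerically on a truncated chain. The sketch below is an illustration under stated assumptions, not the paper's method: the input is unit-variance Gaussian, step sizes are $\gamma^i$ on a finite window of states clamped at both ends, and the stationary distribution is found by iterating a "lazy" version of the chain (stay put with probability 1/2). The lazy averaging is purely a numerical device to avoid slow convergence caused by near-periodicity; it leaves the stationary distribution unchanged. The name `stationary` is hypothetical.

```python
import math


def stationary(b, k, l, steps=20000):
    """Stationary distribution of the clamped walk, via iteration of the lazy chain."""
    n = len(b)
    p = [1.0 / n] * n
    for _ in range(steps):
        q = [0.5 * m for m in p]                       # lazy half: no move
        for s, m in enumerate(p):
            q[max(s - k, 0)] += 0.5 * b[s] * m         # lower slot: k steps down
            q[min(s + l, n - 1)] += 0.5 * (1.0 - b[s]) * m   # upper slot: l steps up
        p = q
    return p


gamma, k, l = 1.2, 1, 2
lo, hi = -25, 12                                       # assumed window of states
b = [math.erf(gamma ** i / math.sqrt(2.0)) for i in range(lo, hi + 1)]
p = stationary(b, k, l)

lower_slot = sum(bi * pi for bi, pi in zip(b, p))      # left side of (25)
m1, m2 = gamma ** -k, gamma ** l
product = m1 ** lower_slot * m2 ** (1.0 - lower_slot)  # left side of (28)
```

With the window chosen wide enough that the end states carry negligible probability, `lower_slot` agrees with l/(k + l) = 2/3 and `product` agrees with 1 to within a very small truncation error.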

3.3 Geometric bounds on the stationary probabilities 

In this section we prove a fundamental property of the stationary 
distribution of the step size which holds for all values of $\gamma$. We obtain 
sharp bounds on almost all of the stationary probabilities — the bounds 
apply as well to the saturating adaptive quantizer — which show that 
the stationary probability of the random walk being in a particular 
state falls off at least geometrically with the distance of that state from 
the 0 state. The actual bounds obtained are substantially stronger and 
they indicate that a localization property on the stationary distribu- 
tion is inherent for the random walk. As discussed in Section 1.1 this 
localization property is important in understanding the basis for the 
satisfactory behavior of the adaptive quantizer. 

We obtain the following point-wise bound: for every i > 0 we give 
positive constants r > 1 and c such that for all j ^ i, 

The quantities r and c depend on i. The quantity r, which we call the 
local steepness factor, is a monotonic increasing function of i for nonnegative 
i. Of course, a corresponding result holds for i < 0 and all 
$j \le i$. 

* D. J. Goodman first conjectured the existence of (28) in the context of the adaptive 
quantizer. Earlier, N. S. Jayant 5 made a related conjecture in connection with 
an adaptive delta-modulator. 

Let $P_i$ denote the $(k + l - 1)$-dimensional column vector* with the 
following components: 

\[ P_i = [p_i, p_{i+1}, \ldots, p_{i+k+l-2}]^t. \qquad (31) \]

Then, from (23), we obtain $(k + l - 1) \times (k + l - 1)$ transition 
matrices $A_i$, where 

\[ P_{i+1} = A_i P_i. \qquad (32) \]

The leading $(k + l - 2)$ components of $P_{i+1}$ are obtained from $P_i$ by 
merely shift operations. The nontrivial information in $A_i$ is in the 
last row which is obtained from (23); clearly, $A_i$ depends on i. 

We will show that there exist a constant weight vector $\lambda$, every 
element of $\lambda$ being positive, and a constant r > 1 depending only on 
$A_i$, such that for all $j \ge i$ 

\[ \lambda^t A_j^{-1} \ge r \lambda^t \qquad (33) \]

in the sense that every element of the left vector is not less than the 
corresponding element of the right vector. Since $P_{j+1}$ is a vector with 
nonnegative elements, we have 

\[ r \lambda^t P_{j+1} \le \lambda^t A_j^{-1} P_{j+1} = \lambda^t P_j. \qquad (34) \]

Hence, 

\[ \lambda^t P_j \le \left(\frac{1}{r}\right)^{j-i} (\lambda^t P_i), \quad j \ge i. \qquad (35) \]

Remarks: Equation (35) is a strong result if $\lambda^t P_j$ is viewed as a norm of 
the vector $P_j$ of the $L_1$-type: $\|x\| = \sum_i \lambda_i |x_i|$, which is a valid interpretation 
since the latter reduces to $\lambda^t x$ whenever every element of x 
is nonnegative. By standard methods we can obtain upper bounds for 
$P_j$ in norms other than the one used in (35). In particular, (30) follows 
trivially. 

It is necessary now to discuss the structure of the matrix $A_j^{-1}$. 
Directly from (23) we obtain the first row:† 

\[ \Big[ \underbrace{-\frac{a_{j+1}}{a_j},\; -\frac{a_{j+2}}{a_j},\; \ldots,\; -\frac{a_{j+l-1}}{a_j}}_{(l-1)\text{ terms}},\; \underbrace{\frac{b_{j+l}}{a_j},\; \frac{b_{j+l+1}}{a_j},\; \ldots,\; \frac{b_{j+l+k-1}}{a_j}}_{k\text{ terms}} \Big]. \]

The remaining rows of $A_j^{-1}$ reflect shift operations: for m = 2, 3, ..., 
$(k + l - 1)$, 

\[ (A_j^{-1})_{mn} = \begin{cases} 0 & \text{if } n \ne (m-1) \\ 1 & \text{if } n = (m-1). \end{cases} \]

* The superscript t denotes the transpose. 

† Observe that neither $A_j$ nor $A_j^{-1}$ is a stochastic matrix (nonnegative elements, 
columns sum to unity). 

Before proceeding to prove (33) we need the following lemma. 
This lemma concerns the matrix $\tilde{A}_i^{-1}$ which is obtained from $A_i^{-1}$ by 
merely replacing the first $(l - 1)$ elements of the first row by $-1$. 

Lemma 1: For every $i \ge 0$, 

(i) $\tilde{A}_i^{-1}$ has a unique positive real eigenvalue, r say. Furthermore, 
r > 1. 
(ii) Every element of the corresponding left eigenvector $\lambda$ is of 
the same sign and nonzero; hence, $\lambda$ may be taken to be a 
positive vector. 
(iii) r, which depends on i, is monotonic, strictly increasing with i. 

We give the proof of Lemma 1 in Appendix A. 

We need one further observation to prove (33) with the help of the 
lemma. For $j \ge i$, 

\[ \lambda^t A_j^{-1} = \lambda^t (A_j^{-1} - \tilde{A}_i^{-1}) + \lambda^t \tilde{A}_i^{-1} = \lambda^t (A_j^{-1} - \tilde{A}_i^{-1}) + r\lambda^t. \]

The bound in (33) follows if $\lambda^t (A_j^{-1} - \tilde{A}_i^{-1}) \ge 0$. Since $\lambda$ is a positive 
vector it is sufficient to show that the elements of the matrix 
$(A_j^{-1} - \tilde{A}_i^{-1})$ are nonnegative. The only nonzero elements of the 
matrix $(A_j^{-1} - \tilde{A}_i^{-1})$ are in the first row. That every term of the first 
row is nonnegative is implied by the following: for $s \ge 1$, 

\[ 1 - \frac{a_{j+s}}{a_j} \ge 0 \qquad (36) \]

and 

\[ \frac{b_{j+s}}{a_j} - \frac{b_{i+s}}{a_i} \ge 0. \qquad (37) \]

This concludes the proof of (33) and, hence, of (35). 

Remarks: 

(i) The reader may now appreciate the reason for replacing some of 
the elements of $A_i^{-1}$ by $-1$ to form $\tilde{A}_i^{-1}$: $a_{j+s}/a_j$, although bounded by 
1, can come arbitrarily close to 1. 

The reader is also due an explanation for our having worked with 
$A_j^{-1}$ after defining the natural transformation $A_j$, especially since 
(34) may be put in the form $\lambda^t [I - rA_j] P_j \ge 0$. The reason is that 
r and $\lambda$, depending only on i, do not exist such that for $j \ge i$, 
$\lambda^t [I - rA_j] \ge 0$, although, as we have shown, $\lambda$ and r do exist such 
that $\lambda^t [I - rA_j] A_j^{-1} \ge 0$. In working this step the assumption of 
$P_{j+1} \ge 0$, rather than $P_j \ge 0$, appears to be critical. 

(ii) The interesting quantity r = r(i) may reasonably be called the 
local steepness factor, since for $i \ge 0$ it is a local measure of the rapidness 
with which the stationary distribution falls off. From statement 
(iii) of the lemma we have the fact that the distribution tends to get 
steeper with increasing distances from the natural center of the distribution, 
the 0 state. 

(iii) The theoretical interest in the inequality in (35) results from 
the fact that we cannot expect to obtain a significantly better value 
than r for the geometric factor in geometrical bounds on $p_j$ for all 
$j \ge i$. The reason for this is that by making $b_{j+1}$ very close to $b_j$ over a 
fairly large set of j's, it is possible to make the solution of (23) close to 
the stationary probabilities of a random walk with uniform transition 
probabilities, which in turn may be obtained in terms of r as the unique 
positive real root of the characteristic polynomial $C(\mu)$ given in 
eq. (56), Appendix A. 

(iv) From symmetry we expect results similar to (35) to hold for 
i < 0. Perhaps the simplest way to show this is by means of the follow- 
ing transformations which have the effect of making the direction of 
decreasing i the forward direction. Let 

p'-i = p^ b'-t = a i} a'-t = &,-. 

The basic recursion (23), stated in terms of the new variables, is 

y+i j-k+i 

Now $\{b'_i\}$ is a monotonic, increasing sequence with $i$ and
$i > 0 \Rightarrow l b'_i > k a'_i$. (Observe the interchange of $l$ and $k$,
i.e., $l' = k$ and $k' = l$.) This transformation makes the transfer of
results holding for $i > 0$ to $i < 0$ fairly straightforward.

(v) In considering the application of (35) to the saturating adaptive
quantizer we note that the basic recursion (23) holds over the entire
range of states, i.e., (23) holds for all $j \leq L - k$. Hence, (35) holds for
$L - (l + k) + 2 \geq j \geq i \geq 0$. This observation is the basis for a
statement made earlier in Section 3.2, namely, we expect the tail
probabilities of the stationary distribution of the step size for the
saturating adaptive quantizer to be small.

From (35) we obtain a rather simple point-wise bound on the
stationary probabilities. Let $\lambda_m$ denote the largest element of the
vector $\lambda$. Clearly,*

$$\lambda' P_j \leq \lambda_m \mathbf{1}' P_j,$$

and, hence, from (35), for all $j \geq i \geq 0$



i.e., 



Hence, 



P«^i(j)' '(l' p .). 



P y + * + i- S ^ ( ; ) ' '(l'P.) *($)"* J^i^O, 



(38) 



where r = r(i). 



3.4 Lower bounds on the steepness factors, r(i) 

We have associated with every state $i$ a local steepness factor $r(i)$.
Here we go back to the definition of $r(i)$ as being the unique positive
root of the polynomial $C(\mu)$, eq. (56), to obtain the following bound
which has the advantage of being explicit.






?:^o. 



(39) 



Observe that $\rho(i) > 1$ for all $i > 0$ and itself forms a monotonic
increasing sequence with $i$. To prove (39) it is enough to show that
C[(/c6,/Za,)] ^ 0. The proof is straightforward but tedious and we 
omit it. 

3.5 The effect of y on the stationary distribution 

We show here that the mass of the stationary distribution of the
step size can be localized about the central step size to an arbitrary
extent by making $\gamma$ sufficiently close to unity. To do this, we first put
together from the results of the preceding sections a rather explicit
bound on the stationary probability of the step size exceeding a
particular value for a given $\gamma$, i.e., $\Pr[\Delta \geq C\gamma^{i}]$. This bound is in
a form which allows direct comparison with the corresponding probability
arising from the choice of $\gamma' = \sqrt{\gamma}$. By successively taking $\gamma$ to be
the square root of the preceding value, the bound on the probability
can be made as small as desired. As before, we shall restrict our
attention to step sizes which exceed the central step size, i.e., $i > 0$,
since a parallel argument holds for $i < 0$.

* The column vector with every element equal to unity is denoted by $\mathbf{1}$.

For $i > 0$ and $r = r(i)$, we have from (35) that

(SX.) £ p^EVP^^lf-V^'P.-^- (40) 
j=i+k+i-2 j=i ]=o\r/ r 1 

Now, as in (39), 



/(*+!-!) 



and 
Since 



* ( P< r 1 

— - s max Ipi, • • • , Pi+k+i-2j- 



Pr. [A ^ C 7 ,+fc+z - 2 ] = Y, V> 



we have, from (40), 



Pr. [A ^ Cy i+k+l ~ 2 2 ^ f -S % l i max [p<, • • •, p<+k+i-s]. 
p\i) 1 



Finally, from (38), for $i \geq k + l - 1$,



max 



/ 1 \ •-*-'+! 



(41) 



(42) 



Equations (41) and (42) give the bound for the mass of the
distribution to the right of a particular state, which we shall now compare
with a similar bound that holds for $\gamma' = \sqrt{\gamma}$. The prime superscript
will be used on symbols to denote the functional dependence of the
associated quantities on $\gamma'$. In establishing the reference (central) step
size [see eq. (3)], minor differences exist depending on whether

(i) $\Pr[|x(n)| \leq \gamma^{i-1}] < \dfrac{l}{k+l} \leq \Pr[|x(n)| \leq \gamma^{i-1/2}]$

or

(ii) $\Pr[|x(n)| \leq \gamma^{i-1/2}] < \dfrac{l}{k+l} \leq \Pr[|x(n)| \leq \gamma^{i}]$.

We consider only (ii), in which case $\omega'(n) = 2i \Leftrightarrow \omega(n) = i$ and
$b'_{2i} = b_i$ for all $i \geq 0$.

Repeating the arguments leading to (41) and (42) we have 

Pr. [A ^ CV?*<*+'- 2 >] ^ p/ ff^ x max [p«, • • • , !*+*+._,] (43) 
and 

[X I 2i-k-l 
-772) ' ^ 

Since p'(2«) = p(i), we have 

Pr. [A > CV?™>] * ^ [ -L ] -'- [ -i- ] ". (45) 

Comparison with (41) and (42) completes the demonstration. 

IV. TRANSIENT RESPONSE 

The preceding section discusses various aspects of the stationary 
distribution of the step size which effectively describes the steady-state 
behavior of the device. However, as stated before in Section I, the 
steady-state response is only of partial interest since the adaptability 
of the device is tied to quickness of response in the following situations : 

(i) Start up — we are forced to consider situations in which the
initial step size is fairly arbitrary.
(ii) Changes in the scaling of the input distribution — the scenario
here is that the device has adapted to a particular intensity level
(scaling) of the input distribution when a jump occurs to a new
intensity level.

Common to both situations, we have an initial step size and a
waiting time for the step size to adapt to the desired step size. Recall
that with $k$ and $l$ appropriately chosen, the desirable step size is the
central step size, which corresponds to the 0 state in the random walk,
eq. (5). This aspect of the behavior of the device is also related to the
rate at which the evolving step size distribution approaches the
stationary distribution.

The main contribution of this section is the development of formulas
for the efficient computation of the mean time required for the step
size to first reach the central step size for various values of the initial
step size. The designer can use the information generated by the
methods given here in the following manner. Assuming that the
designer has some understanding of the rate of variation of the intensity
level of the input distribution, he is in a position to determine the
smallest value of $\gamma$ for which the adaptation algorithm adequately
tracks the input process. The parameter $\gamma$ has to be made sufficiently
large for the mean waiting time for adaptation (time, of course, is used
synonymously with number of transitions) to be small compared to the
time scale of the changes in the location of the desired step size arising
from changes in the intensity level.

4.1 The mean time for first passage to the origin 

We will consider the random walk, eqs. (5) and (12), for the
saturating adaptive quantizer since in the limit, as $L$ becomes large, the
functionals obtained for this model yield the corresponding quantities for
the adaptive quantizer. Also, we shall consider only the case of the
initial state $\omega(0) > 0$ since the results obtained can be transferred to
the case of negative initial states in a fairly obvious manner [see
Remark (iv) of Section 3.3].

Let the initial state $\omega(0) = i > 0$ and let $M_i$ denote the mean time
required for the first occurrence of the event $\omega(n) \leq 0$. We observe
that for all values of $L$, not necessarily finite, the time to first passage
is finite with probability 1 as a consequence of the properties of
recurrence and irreducibility established earlier in Section II. If the first
transition results in a decrease of the step size, the process continues
as if the initial state had been $(i - k)$. The conditional expectation of
the first passage time, therefore, is $M_{i-k} + 1$. Similarly, if the first
transition results in an increase, the conditional expectation is
$M_{i+l} + 1$. From this argument we deduce that the mean first passage
time satisfies the recursion*

$$M_i = b_i(M_{i-k} + 1) + a_i(M_{i+l} + 1) \quad \text{for } (k + 1) \leq i \leq (L - l). \tag{46}$$

The relation in (46) may be used to generate the entire sequence $\{M_i\}$
provided the initial conditions are known. Now, by the same argument
that led to (46), we have that (46) holds for $1 \leq i \leq k$ with

$$M_{1-k} = M_{2-k} = \cdots = M_0 = 0.$$



* There is some similarity between (46) and the equations arising in gambler's 
ruin problems 16 and sequential analysis, 16 in which generally k = I = 1 and the transi- 
tion probabilities are not variable. 
The remaining $l$ boundary conditions, namely,

$$M_1, M_2, \cdots, M_l$$

are hard to obtain and it is necessary to look more deeply into the
dynamics of the process to obtain these quantities.

For every sampling instant we define the $L$-dimensional vector $z(n)$
with components $z_j(n)$, $1 \leq j \leq L$, where

$$z_j(n) = \Pr[\omega(n) = j \text{ and } \omega(s) \geq 1 \text{ for all } s \leq n]. \tag{47}$$

These vectors, $z(n)$, evolve with time according to

$$z(n + 1) = Dz(n), \quad n \geq 0. \tag{48}$$

These equations are derived in Appendix B. Here we reproduce the
structure of the $L \times L$ matrix $D$:



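$$D_{j,\,j+k} = b_{j+k} \ (1 \leq j \leq L - k), \qquad D_{j,\,j-l} = a_{j-l} \ (l + 1 \leq j \leq L - 1), \qquad D_{L,\,i} = a_i \ (L - l \leq i \leq L),$$

with all other elements equal to zero; that is, the $b$'s lie on the $k$th
superdiagonal, the $a$'s lie on the $l$th subdiagonal, and the remaining
$a$'s are collected in the last row.

Putting together various properties of the matrix $D$ and the random
walk, we obtain, in Appendix B, the following result: for $i \geq 1$

$$M_i = \sum_{j=1}^{L} x_j^{(i)}, \quad \text{where } [I - D]x^{(i)} = e_i, \tag{49}$$

and the elements of the vector $e_i$ are zero everywhere except at the
$i$th location where it is unity. In Appendix B it is shown that $[I - D]$
is nonsingular. We observe parenthetically the virtue of the recursion
given in (46) in that it allows us to generate rather easily all the $M_i$'s
once the $l$ inversions necessary to evaluate $M_1, \cdots, M_l$ are carried out.
The matrix inversion in (49) may be viewed as a mixed boundary
value problem with the first $l$ and the final $k$ equations providing the
boundary conditions. The bulk of the elements of the vector $x^{(i)}$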
satisfy a recursion that was encountered previously in Section III:

$$x_j^{(i)} = b_{j+k}x_{j+k}^{(i)} + a_{j-l}x_{j-l}^{(i)}. \tag{50}$$

Furthermore, we show in Appendix B that the elements $x_j^{(i)}$ are all
nonnegative. Hence, we are in a position to usefully apply, even for
infinite $L$, the techniques and results of Section III.

First, we carry out the reduction of the equations as stated in
Section 3.1 where the motivation for this step is discussed. We obtain

$$\sum_{j=r+1}^{r+k} b_jx_j = \sum_{j=r-l+1}^{r} (1 - b_j)x_j; \quad l \leq r \leq (L - k). \tag{51}$$

The superscripts on the $x$'s have not been used since (51) holds for
all $x^{(i)}$, $1 \leq i \leq l$.

One benefit of the above form is that it involves one less variable
than the original recursion (50). In the important case of $k = 1$ and
$l \geq 1$, this reduction is sufficient to transform the original mixed
boundary value problem (49) to an initial value problem, i.e., the
solution to the matrix inversion problem (49) satisfies a recursion with
specified initial conditions. Exact computation in this case becomes
quite trivial. The details of this solution are given in Appendix C.
Apart from its independent interest, this result is of particular interest
in the adaptive quantizer when the distribution of the input sequence
is Gaussian. As discussed previously, it is desirable to have in this
case $l/(k + l) = 0.68$, and $k = 1$ and $l = 2$ (which give
$l/(k + l) \approx 0.67$) will suffice.

Another property of the solutions $x^{(i)}$ of (49) which holds for all $L$
is that with increasing $j$, $x_j^{(i)}$ decreases at least geometrically. This
conclusion may be drawn from the bounds obtained in Section 3.3, eqs.
(35) and (38). From the point of view of numerical inversion of
$[I - D]$ for large $L$, this is a critical property in that it is a necessary
condition for most numerical techniques. The reader is referred to
Richtmyer and Morton$^{17}$ for one such technique that we have used
successfully and found to be efficient in that it effectively exploits the
band structure of the matrix $[I - D]$.
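
As an illustration of (49), the following minimal sketch builds $D$ for a saturating walk with down-steps of $k$ and up-steps of $l$, solves $[I - D]x^{(i)} = e_i$, and sums the solution to obtain $M_i$. It is not the band-exploiting method of Ref. 17: a plain dense Gaussian elimination is swapped in for brevity. The probabilities `b`, the parameters `k`, `l` and all function names are hypothetical choices made only for this example.

```python
def build_D(b, k, l):
    """L x L matrix D; b[j-1] holds b_j for states j = 1..L (illustrative values)."""
    L = len(b)
    D = [[0.0] * L for _ in range(L)]
    for j in range(1, L + 1):              # column j: transitions out of state j
        if j - k >= 1:                     # otherwise the walk has crossed to <= 0
            D[j - k - 1][j - 1] += b[j - 1]
        D[min(j + l, L) - 1][j - 1] += 1.0 - b[j - 1]   # up-steps saturate at L
    return D


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for t in range(c, n + 1):
                M[r][t] -= f * M[c][t]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][t] * x[t] for t in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def mean_first_passage(b, k, l, i):
    """M_i = sum_j x_j^(i), where [I - D] x^(i) = e_i, as in eq. (49)."""
    L = len(b)
    D = build_D(b, k, l)
    I_minus_D = [[(1.0 if r == c else 0.0) - D[r][c] for c in range(L)]
                 for r in range(L)]
    e_i = [1.0 if r == i - 1 else 0.0 for r in range(L)]
    return sum(solve(I_minus_D, e_i))


# Hypothetical increasing sequence b_1..b_40 with b_j > l/(k+l); not from the paper.
b = [1.0 - 0.3 * 0.9 ** j for j in range(1, 41)]
k, l = 2, 3
M = [0.0] + [mean_first_passage(b, k, l, i) for i in range(1, len(b) + 1)]
```

The list `M` (with `M[0] = 0`) can then be checked against the recursion (46) over its stated range $1 \leq i \leq L - l$, taking $M_{i-k} = 0$ whenever $i - k \leq 0$.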

Finally, we remark that while we have dealt exclusively with first
passage across the 0 state, it is clear that generalizations to first
crossings across states other than the 0 state are straightforward.

4.2 Bound on the mean first passage time 

Two formulas, eqs. (46) and (49), have been given for computing
the mean time required for the step size to adapt from an arbitrary
initial value to the desired, and also central, step size. However, by
examining these formulas it is not easy to gain insights into the rate
at which this adaptation time grows with the distance separating the
two states and its dependence on $\gamma$. Here, by probabilistic reasoning, we
obtain an explicit upper bound on this time and this bound does
provide some insight. As we have done before, we consider here only the
case of positive initial states, i.e., $\omega(0) > 0$. Let $M_{ij}$, $0 \leq i < j$, denote
the mean first passage time under the following conditions: the initial
state $\omega(0) = j$ and first crossing occurs after $\tau$ transitions if $\omega(\tau) \leq i$
and $\omega(n) > i$ for all $n < \tau$; then $M_{ij} = E(\tau)$. In this notation the
quantity $M_j$ defined in Section 4.1 is equivalent to $M_{0j}$.

In Section 1.2, eq. (10), it is given that

$$E[\omega(n + 1) \mid \omega(n) = i] - i = -(k + l)\left[b_i - \frac{l}{k + l}\right]. \tag{52}$$



Denote the quantity on the right by $-S_i$ and observe that for $i > 0$,
$S_{i+1} > S_i > 0$; hence, the supermartingale property. [For the saturating
adaptive quantizer, the supermartingale property holds even more
strongly, i.e., for $i > 0$, (52) holds with the equality replaced by $\leq$.]
In fact, the supermartingale property holds for the transformed
process $\omega'(n) = \omega(n) + nS_{i+1}$, i.e.,

$$E[\omega'(n + 1) \mid \omega'(n)] \leq \omega'(n) \tag{53}$$

for all $\omega'(n) \geq (i + 1) + nS_{i+1}$. For the crossing problem, (53) holds
for all $(n + 1) \leq \tau$, the crossing time. We can now apply a theorem
due to Doob$^{18}$ on optional stopping on supermartingales. In this case,
the theorem states that

$$E[\omega'(\tau)] \leq E[\omega'(0)]. \tag{54}$$

Since

$$(i + 1 - k) + S_{i+1}E(\tau) \leq E[\omega'(\tau)] \leq E[\omega'(0)] = j,$$

we obtain

$$M_{ij} = E(\tau) \leq \frac{1}{S_{i+1}}[(j - i) + (k - 1)]. \tag{55}$$



We gain some insight on the role of $\gamma$ in determining the transient
response of the device by observing the dependence of the above bound
on $\gamma$. Suppose we are interested in $M_{0j}$, the waiting time for the initial
step size $\Delta(0) = C\gamma^{j}$ to reach the central step size $C$. Consider the
effects of making $\gamma' = \sqrt{\gamma}$ on this waiting time (the multiplier
coefficients of the device are therefore $\gamma'^{-k}$ and $\gamma'^{l}$). We let the prime
superscript on symbols indicate a functional dependence on $\gamma'$. In
establishing the new central step size [see eq. (3)], minor differences
exist depending on whether

(i) $\Pr[|x(n)| \leq \gamma^{i-1}] < \dfrac{l}{k+l} \leq \Pr[|x(n)| \leq \gamma^{i-1/2}]$

or

(ii) $\Pr[|x(n)| \leq \gamma^{i-1/2}] < \dfrac{l}{k+l} \leq \Pr[|x(n)| \leq \gamma^{i}]$.

We consider only (ii), in which case the central step sizes are identical:
$\omega'(n) = 2i \Leftrightarrow \omega(n) = i$ and $b'_{2i} = b_i$ for all $i \geq 0$. The waiting time
for the step size to adapt from identical initial step size $C\gamma^{j}$ to final
step size $C$ is $M'_{0,2j}$. From (55),

$$M'_{0,2j} \leq \frac{1}{S'_1}[2j + (k - 1)].$$

Now, $0 < S'_1 \leq S_1$; hence, making $\gamma' = \sqrt{\gamma}$ and keeping $k$ and $l$
unchanged has the effect of making the bound on the waiting time at
least twice as large for $j \gg k$. This is a conclusion which is plausible
in the light of the linear form of the bound (55) since the effect of
making $\gamma' = \sqrt{\gamma}$ is to introduce twice as many transitions between the
initial and final step sizes.

Fig. 3 — Transient response of the adaptive quantizer.
Fig. 4 — Transient response of the adaptive quantizer.
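
The comparison between $\gamma$ and $\gamma' = \sqrt{\gamma}$ can be tried numerically. The sketch below evaluates the bound (55) for first passage to the 0 state under two assumptions that are not taken from this section: the input is a unit-variance Gaussian, and the probability of a step-size decrease in state $i$ has the form $b_i = \Pr[|x(n)| \leq C\gamma^{i}]$, with $C$ normalised so that $\Pr[|x(n)| \leq C] = l/(k+l)$. All function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed unit-variance Gaussian input


def central_step(k, l):
    """C such that Pr[|x| <= C] = l/(k+l) (assumed normalisation)."""
    return STD_NORMAL.inv_cdf(0.5 + 0.5 * l / (k + l))


def b_prob(C, gamma, i):
    """Assumed form b_i = Pr[|x(n)| <= C * gamma**i]."""
    return 2.0 * STD_NORMAL.cdf(C * gamma ** i) - 1.0


def drift(C, gamma, k, l, i):
    """S_i = (k + l) * (b_i - l/(k + l)), the negative of the right side of (52)."""
    return (k + l) * b_prob(C, gamma, i) - l


def bound_55(C, gamma, k, l, i, j):
    """Upper bound (55) on the mean first passage time M_ij."""
    return ((j - i) + (k - 1)) / drift(C, gamma, k, l, i + 1)


k, l, gamma, j = 1, 2, 1.1, 10
C = central_step(k, l)
S1 = drift(C, gamma, k, l, 1)                 # S_1 for gamma
S1_prime = drift(C, sqrt(gamma), k, l, 1)     # S'_1 for gamma' = sqrt(gamma)
bound = bound_55(C, gamma, k, l, 0, j)                   # bound on M_{0,j}
bound_prime = bound_55(C, sqrt(gamma), k, l, 0, 2 * j)   # bound on M'_{0,2j}
print(bound, bound_prime, bound_prime / bound)
```

With $k = 1$ the ratio of the two bounds is exactly $2S_1/S'_1$, so under these assumptions it cannot fall below 2.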

4.3 Computational results 

We present here a sampling of our computational results. It is
assumed that for every $n$, $x(n)$ is normally distributed with unit
variance. The optimal step size $\Delta$ in this case has the property that
$\Pr\{|x(n)| \leq \Delta\} = 0.68$. To center the stationary distribution of the
step size close to the optimal step size, we choose $k = 1$ and $l = 2$.

Figure 3 plots the mean time for first passage to the optimal step
size vs. initial step size, and the initial step sizes chosen for this figure
exceed the optimal step size. Various values of $\gamma$ were used; in the
figure legends the step-size multipliers are labeled $M_1 = \gamma^{-k}$ and
$M_2 = \gamma^{l}$. Figure 4 provides the same information except that the
horizontal axis corresponds to $\log_{10} \Delta(0)$, rather than $\Delta(0)$ as in Fig. 3.
The mean first passage times $M_1$ and $M_2$ were obtained by the method
outlined in Appendix C, and $M_i$, $i \geq 3$, were generated by using the
recursion in (46). To give some idea of the rate of convergence for
$x_j^{(i)}$, eqs. (70) and (71), we tabulate some values of $x_j^{(1)}$ for the case of
$\gamma = 1.1$:

 $j$     $x_j^{(1)}$
 1     1.4
 2     0.53
 3     0.66
 4     0.31
 5     0.20
 6     0.08
 7     0.03
 8     $0.92 \times 10^{-2}$
 9     $0.24 \times 10^{-2}$
 10    $0.41 \times 10^{-3}$
 11    $0.59 \times 10^{-4}$
 12    $0.53 \times 10^{-5}$
 13    $0.35 \times 10^{-6}$
 14    $0.13 \times 10^{-7}$
 15    $0.30 \times 10^{-9}$
 16    $0.31 \times 10^{-11}$

Fig. 5 — Transient response of the adaptive quantizer. (Legend: $M_1 = 0.977$, $M_2 = 1.05$; horizontal axis: initial step size.)
892 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1974



Figure 5 is similar to Fig. 3 except that here the initial step sizes are
less than the optimal step size. Figure 6 plots the same information
with $\log_{10} \Delta(0)$, rather than $\Delta(0)$, on the horizontal axis. The mean
first passage time $M_1$ was obtained by solving (49) by the method
given in Ref. 17 and all other first passage times were generated by the
recursion in (46).




Fig. 6 — Transient response of the adaptive quantizer. 
APPENDIX A
Proof of Lemma 1

Proof:

(i) $A^{-1}$ being in the form of a companion matrix, the coefficients of
the characteristic polynomial of the matrix are the elements of the
first row:

$$C(\mu) = (-1)^{k+l-1}\det[A^{-1} - \mu I] = \mu^{k+l-1} + \cdots + \mu^{k} - [\alpha_1\mu^{k-1} + \alpha_2\mu^{k-2} + \cdots + \alpha_k], \tag{56}$$

where

$$\alpha_1 = \frac{b_{i+l}}{a_i}, \quad \alpha_2 = \frac{b_{i+l+1}}{a_i}, \quad \cdots, \quad \alpha_k = \frac{b_{i+l+k-1}}{a_i}. \tag{57}$$



By Descartes's rule the polynomial $C(\mu)$ has at most one positive real
root. Since $C(0) = -\alpha_k < 0$ and $C(\mu) \to \infty$ as $\mu \to \infty$, there exists
exactly one positive root. Let $r$ denote this root.

Now $C(1) < 0$ if $la_i < (b_{i+l} + b_{i+l+1} + \cdots + b_{i+l+k-1})$. The latter
condition holds for all $i \geq 0$. Hence, $r > 1$.
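
Because $C(\mu)$ is negative at 0 and has exactly one positive root, the steepness factor $r$ can be located by simple bisection. The following is a minimal sketch of that idea, not a procedure taken from the paper; the coefficient values passed in are hypothetical and the function names are ours.

```python
def char_poly(mu, alphas, l):
    """C(mu) = mu^(k+l-1) + ... + mu^k - [alpha_1 mu^(k-1) + ... + alpha_k]."""
    k = len(alphas)
    leading = sum(mu ** t for t in range(k, k + l))
    trailing = sum(a * mu ** (k - t) for t, a in enumerate(alphas, start=1))
    return leading - trailing


def steepness_factor(alphas, l, iterations=200):
    """Unique positive root of C(mu), found by bisection."""
    lo, hi = 0.0, 1.0                      # C(0) = -alpha_k < 0
    while char_poly(hi, alphas, l) <= 0.0:
        hi *= 2.0                          # C(mu) grows without bound, so this stops
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if char_poly(mid, alphas, l) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical coefficients, chosen only so that C(1) < 0 and hence r > 1.
r_quadratic = steepness_factor([3.0], l=2)       # k = 1: mu^2 + mu - 3
r_cubic = steepness_factor([2.0, 2.5], l=2)      # k = 2: mu^3 + mu^2 - 2 mu - 2.5
```

For the quadratic case the root is available in closed form, $(\sqrt{13} - 1)/2$, which gives a direct check on the bisection.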

(ii) The left eigenvector $\lambda'$ corresponding to the eigenvalue $r$
satisfies, by definition, $\lambda'A^{-1} = r\lambda'$. Examining the component
equations we find that

$$\lambda_t = \lambda_1(1 + r + \cdots + r^{t-1}), \quad 1 \leq t \leq l. \tag{58}$$

Also,

$$\lambda_{l+k-t} = \frac{\lambda_1}{r^{t}}[\alpha_{k-t+1}r^{t-1} + \cdots + \alpha_{k-1}r + \alpha_k], \quad 1 \leq t \leq k. \tag{59}$$

Finally, $r\lambda_{l+k-1} = \alpha_k\lambda_1$. Since the $\alpha$'s are positive quantities, the
statement is clearly true.

(iii) The statement can be verified by inspecting the characteristic
polynomial $C(\mu)$ and using the fact that the coefficients $\alpha_1, \cdots, \alpha_k$
each increase with $i$.

APPENDIX B 

Derivation of equations (48) and (49)

The derivation of the equations governing the evolution of the
vectors $z(n)$ defined in eq. (47) proceeds as follows. For convenience,
let $X_n$ denote the event $1 \leq \omega(t) \leq L$ for all $t$, $0 \leq t \leq n$. Hence,
by definition,

$$z_j(n) = \Pr[\omega(n) = j \text{ and } X_n], \quad 1 \leq j \leq L.$$

Since

$$z_j(n) = \Pr[\omega(n) = j \text{ and } X_{n-1}] = \sum_{i=1}^{L} \Pr[\omega(n) = j \mid \omega(n - 1) = i, X_{n-1}]\, z_i(n - 1),$$

we have

$$z_j(n) = \begin{cases}
b_{k+j}z_{k+j}(n - 1), & 1 \leq j \leq l, \\
a_{j-l}z_{j-l}(n - 1) + b_{j+k}z_{j+k}(n - 1), & (l + 1) \leq j \leq (L - k), \\
a_{j-l}z_{j-l}(n - 1), & (L - k + 1) \leq j \leq (L - 1), \\
\sum_{i=L-l}^{L} a_iz_i(n - 1), & j = L.
\end{cases}$$

The above equations define the matrix $D$ which relates $z(n)$ to $z(n - 1)$
as in eq. (48).

For the derivation of eq. (49) we proceed as follows. For
$i = 1, 2, \cdots, L$, let

$$F_i(n + 1) = \Pr[\text{first passage occurs at } (n + 1) \mid \omega(0) = i] = \Pr[\omega(n + 1) \leq 0, X_n \mid \omega(0) = i] = \sum_{j=1}^{k} b_jz_j(n) \quad \text{with } z(0) = e_i. \tag{60}$$

The vector $e_i$ has every element equal to zero except for the $i$th element
which is unity. To express eq. (60) in vector form we let
$b = [b_1\, b_2 \cdots b_k\, 0 \cdots 0]'$. Then, from (60),

$$F_i(n + 1) = b'z(n) \quad \text{with } z(0) = e_i.$$
By definition, we have that the mean first passage time conditional on
the initial state being $i$,

$$M_i = \sum_{n \geq 0} (n + 1)F_i(n + 1) = b'\sum_{n \geq 0} (n + 1)z(n) = b'\sum_{n \geq 1} nz(n) + b'\sum_{n \geq 0} z(n). \tag{61}$$

Now the second term in the above expression is unity since the
probability that passage occurs at finite time is unity. Now consider

$$[I - D]\sum_{n \geq 1} nz(n) = \sum_{n \geq 1} nz(n) - \sum_{n \geq 1} nz(n + 1) = \sum_{n \geq 0} z(n) - z(0). \tag{62}$$

Hence, denoting by $\mathbf{1}$ the column vector with every element equal to
unity, we have from (62) that

$$\mathbf{1}'[I - D]\sum_{n \geq 1} nz(n) = \mathbf{1}'\sum_{n \geq 0} z(n) - 1 \tag{63}$$

$$= b'\sum_{n \geq 1} nz(n), \tag{64}$$

since $\mathbf{1}'z(0) = 1$ and $b' = \mathbf{1}'[I - D]$. It only remains to consider

$$\sum_{n \geq 0} z(n) = \left[\sum_{s \geq 0} D^{s}\right]z(0).$$

The above series converges since every eigenvalue of the matrix $D$
lies strictly within the unit circle in the complex plane. The proof of
this follows from an old matrix theorem$^{19}$ which states that if the
diagonal elements of the columns weakly dominate the sum of the
absolute values of the off-diagonal elements with strong dominance
holding for at least one column and the matrix is irreducible, then the
determinant is nonzero. Applying this theorem to $[D - \lambda I]$, $|\lambda| \geq 1$,
we note that the irreducibility of the original Markov chain implies
irreducibility of the matrix $[D - \lambda I]$ and that the weak column
dominance property holds everywhere while the strong column
dominance property holds for the first $k$ columns. Hence,

$$\sum_{n \geq 0} z(n) = \left[\sum_{s \geq 0} D^{s}\right]z(0) = [I - D]^{-1}z(0). \tag{65}$$
Putting together the above results we have (49), namely,

$$M_i = \sum_{j} x_j^{(i)} \quad \text{where } [I - D]x^{(i)} = e_i.$$

Observe that $x^{(i)} = \sum_{n \geq 0} z(n)$ and, from the definition of $z(n)$, it follows
that every element of $x^{(i)}$ is nonnegative.
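
The identities of this appendix can be checked numerically by propagating $z(n + 1) = Dz(n)$ directly, without forming $D$ or inverting $[I - D]$. The sketch below accumulates $F_i(n + 1) = b'z(n)$ and $\mathbf{1}'z(n)$ until the surviving mass is negligible, so that the total passage probability, the mean computed from (61), and the mean computed as $\mathbf{1}'\sum_n z(n)$ can be compared. The probabilities `b`, the parameters, the stopping tolerance and the function names are hypothetical choices for illustration.

```python
def step(z, b, k, l):
    """Apply z(n+1) = D z(n); also return the mass that crosses to states <= 0."""
    L = len(b)
    nxt = [0.0] * L
    passed = 0.0
    for j in range(1, L + 1):
        down = b[j - 1] * z[j - 1]
        up = (1.0 - b[j - 1]) * z[j - 1]
        if j - k >= 1:
            nxt[j - k - 1] += down
        else:
            passed += down                   # contributes to F_i(n+1) = b' z(n)
        nxt[min(j + l, L) - 1] += up         # up-steps saturate at state L
    return nxt, passed


def passage_statistics(b, k, l, i, tol=1e-15, max_steps=200000):
    """Return (sum of F, sum of (n+1) F, sum of 1'z(n)) starting from state i."""
    z = [0.0] * len(b)
    z[i - 1] = 1.0                           # z(0) = e_i
    total_prob = mean_from_F = mean_from_z = 0.0
    n = 0
    while sum(z) > tol and n < max_steps:
        mean_from_z += sum(z)                # accumulates 1' z(n)
        z, F = step(z, b, k, l)
        total_prob += F
        mean_from_F += (n + 1) * F
        n += 1
    return total_prob, mean_from_F, mean_from_z


# Hypothetical increasing sequence b_1..b_30; not taken from the paper.
b = [1.0 - 0.3 * 0.9 ** j for j in range(1, 31)]
total_prob, mean_from_F, mean_from_z = passage_statistics(b, k=2, l=3, i=5)
```

In this sketch the two estimates of the mean agree up to the truncation of the series, which mirrors the step from (61) to (65).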

APPENDIX C 

Mean first passage times for the case $k = 1$, $l \geq 1$

We have as our starting point eq. (49), namely,

$$M_i = \sum_{j} x_j^{(i)}, \tag{66}$$

where

$$[I - D]x^{(i)} = e_i \tag{67}$$

and we are interested only in $1 \leq i \leq l$.

The transformation that was made in Section 3.1 is equivalent to
the following: add to each row, $r$, of $[I - D]$ all rows $r + 1, r + 2, \cdots$;
and do the same to the vector $e_i$. This operation makes the matrix
$[I - D]$ lower triangular, the reason being that with the exception of
the first column, the elements of all other columns of $[I - D]$ sum to
zero. The resulting equations are as follows: the first component
equation yields

$$b_1x_1^{(i)} = 1, \tag{68}$$

and the next $(l - 1)$ equations: $2 \leq r \leq l$,

$$-\sum_{j=1}^{r-1} a_jx_j^{(i)} + b_rx_r^{(i)} = \begin{cases} 1 & \text{if } r \leq i \\ 0 & \text{if } r > i. \end{cases} \tag{69}$$

Finally,

$$x_r^{(i)} = \frac{1}{b_r}\sum_{j=r-l}^{r-1} a_jx_j^{(i)} \quad \text{for } r > l. \tag{70}$$



The boundary conditions to the basic recursion in (70) are in (68) and
(69) which are, of course, solvable:

$$1 \leq r \leq i: \quad x_r^{(i)} = 1 \Big/ \prod_{j=1}^{r} b_j$$

$$(i + 1) \leq r \leq l: \quad x_r^{(i)} = (x_i^{(i)} - 1) \Big/ \prod_{j=i+1}^{r} b_j. \tag{71}$$
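
The initial value problem of this appendix amounts to one pass of forward substitution. The sketch below carries it out for $k = 1$ by solving the triangular system row by row, which covers (68), (69) and (70) in a single loop. To have concrete numbers it assumes, beyond what this appendix states, a unit-variance Gaussian input with $b_j = \Pr[|x(n)| \leq C\gamma^{j}]$ and $C$ normalised so that $\Pr[|x(n)| \leq C] = l/(k + l)$; the function names are hypothetical.

```python
from statistics import NormalDist


def gaussian_b(gamma, k, l, L):
    """Assumed b_j = Pr[|x| <= C*gamma**j], x ~ N(0,1), Pr[|x| <= C] = l/(k+l)."""
    nd = NormalDist()
    C = nd.inv_cdf(0.5 + 0.5 * l / (k + l))
    return [2.0 * nd.cdf(C * gamma ** j) - 1.0 for j in range(1, L + 1)]


def visits_k1(b, l, i):
    """x^(i) for k = 1: row r reads b_r x_r - sum_{j=r-l}^{r-1} a_j x_j = [r <= i]."""
    L = len(b)
    x = [0.0] * (L + 1)                      # 1-based; x[0] is unused
    for r in range(1, L + 1):
        inflow = sum((1.0 - b[j - 1]) * x[j] for j in range(max(1, r - l), r))
        source = 1.0 if r <= i else 0.0
        x[r] = (source + inflow) / b[r - 1]
    return x[1:]


def mean_passage_k1(b, l, i):
    """M_i = sum_j x_j^(i), eq. (66)."""
    return sum(visits_k1(b, l, i))


l = 2
b = gaussian_b(1.1, 1, l, 40)                # gamma = 1.1, k = 1, l = 2, L = 40
x1 = visits_k1(b, l, 1)                      # x^(1)
M = [0.0] + [mean_passage_k1(b, l, i) for i in range(1, len(b) + 1)]
```

Under these assumptions the first two entries of `x1` come out close to 1.40 and 0.53, in line with the first two values tabulated in Section 4.3 for $\gamma = 1.1$, and the resulting `M` can be checked against the recursion (46).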